Structural insights into the SARS-CoV-2 Omicron RBD-ACE2 interaction

The global ﬁ ght against the COVID-19 pandemic is still in great uncertainty due to the emergence of SARS-CoV-2 variants, especially the variants of concern (VOCs) with changed patho-genicity, increased transmissibility and the resistance to con- valescent/vaccination


Supplementary
R393 Y505 A distance cut-off of 4 Å was used.

Supplementary Table 3 | The hydrogen bonds and salt bridges at the WT and
Omicron RBD-ACE2 interfaces WT   With the same tIC parameters, all subfigures show a highly similar appearance and conformational distribution, confirming that longer simulation time will not sample more conformations on the landscape. Thus, the current trajectory timescale is enough to explore the conformational space of the RBD-ACE2 interface. R498 and ACE2. In WT state 3, Q498 interacts with Q42, which similarly promotes its binding affinity (ΔG = -5.43 ± 2.35 kcal/mol).
Supplementary Fig. 6 The interactions between G/S496 of RBD with ACE2 in WT and Omicron systems. The residues in ACE2 within 5 Å to G/S496 are shown in sticks while interactions are shown in yellow dashed lines. The RBD of WT, the RBD of Omicron and ACE2 are shown in cyan, blue and salmon, respectively. In WT system, G496 in state 1 and state 2 show no interaction with ACE2, leading to weak contribution to the binding affinity (-0.95 ± 1.18 and -0.01 ± 0.11 kcal/mol, respectively). As for state 3, G496 interacts with K353 in its main chain, which leads to an increase of the binding ability (-2.05 ± 1.11 kcal/mol). As a contrast, in Omicron system, S496 interacts with D38 in all three macrostates and behaves a lower binding free energy.
Supplementary Fig. 7 The interactions between N/Y501 of RBD with ACE2 in WT and Omicron systems. The residues in ACE2 within 5 Å to N/Y501 are shown in sticks while interactions are shown in yellow dashed lines. The RBD of WT, the RBD of Omicron and ACE2 are shown in cyan, blue and salmon, respectively. In WT system, N501 also has no interaction with ACE2 in states 1 and 2 and shows weak energy contribution (-1.14 ± 0.79 and -0.74 ± 0.77 kcal/mol, respectively). Even in WT state 3, the energy does not decrease too much with the hydrogen bond to Y41 (-2.61 ± 1.28 kcal/mol). This phenomenon may be caused by the hydrophilic sidechain of N501 embedded in a hydrophobic environment made by the sidechains of K353 and Y51. As N501Y changes to a residue with a longer sidechain and a hydrophobic phenyl ring in Omicron system, the interaction with D38 is stable in each macrostate and the ring of tyrosine is suitable in the hydrophobic environment.

Protein expression and purification
The SARS-CoV-2 Omicron RBD and the N-terminal peptidase domain of ACE2 were expressed using the Bac-to-Bac baculovirus system (Invitrogen) as previously stated. Briefly, The SARS-CoV-2 Omicron RBD (residues Thr333-Gly526) with an N-terminal gp67 signal peptide for secretion and a C-terminal 6×His tag for purification was expressed using Hi5 cells and

Crystallization and data collection
Crystals were successfully grown at room temperature in sitting drops, over wells containing 0. Synchrotron Research Facility. Diffraction data were processed using the HKL3000 software [1] and the data-processing statistics are listed in Extended Data Table 1.

Structure determination and refinement
The structure was determined using the molecular replacement method with PHASER in the CCP4 suite [2] . The search models used included the ACE2 extracellular domain and SARS-CoV-2 RBD (PDB: 6M0J). Subsequent model building and refinement were performed using COOT [3] and PHENIX [4] , respectively. Final Ramachandran statistics: 96.95% favored, 3.05% allowed and 0.00% outliers for the final structure. The structure refinement statistics are listed in Extended Data Table 1. All structure figures were generated with PyMol [5] .

Surface plasmon resonance
Binding kinetics of ACE2 and SARS-CoV-2 RBDs were determined by surface plasmon resonance using a Biacore S200 (GE Healthcare). All experiments were performed in a running buffer composed of 10 mM HEPES, pH 7.2, 150 mM NaCl, and 0.005% Tween-20 (v/v). ACE2 was immobilized on a CM5 sensor chip (GE Healthcare) to a level of~500 response units. A 2-fold dilution series ranging from 50 to 3.125 nM of the SARS-CoV-2 WT RBD and omicron RBD were injected at a flow rate of 30 µl/min (association 60s, dissociation 180s), and the immobilized ACE2 was regenerated using 5mM NaOH for 10s. The resulting data were fit to a 1:1 binding model using Biacore Evaluation Software (GE Healthcare).

MD simulations
The Omicron RBD-ACE2 structure and WT RBD-ACE2 complex (PDB ID: 6M0J) were used to build the MD simulation systems. Besides, K493 in Omicron RBD-ACE2 structure was mutated to R493 to follow the recent sequence of Omicron. Therefore, two simulation systems, WT and Omicron were constructed. To keep the consistency among simulation systems, S19-D615 in ACE2 and T333-G526 in RBD were maintained in WT and the two Omicron systems.
FF19SB force field was applied to model the systems [6] . The initial structures were solvated in a truncated octahedral transferable intermolecular potential three point (termed as "TIP3P") water box with a buffer of 10 Å around it. Then, counterions Na+ or Cl-were added to the systems for neutralization and 0.15 mol•L -1 NaCl were added to solvents.
After construction, the systems were firstly minimized for 15,000 cycles with restraint of 500 kcal•mol −1 •Å −2 on the RBD and ACE2. Then, all atoms encountered 30,000 cycles of minimization.
Next, the systems were heated from 0 to 300 K in 300 ps and equilibrated for 700 ps with 10 kcal•mol −1 •Å −2 positional restraint on non-solvent atoms. Finally, the WT, and Omicron simulation systems encountered 8 parallel rounds of 200 ns production MD simulations, respectively. The timestep during MD simulations is 2 fs. During simulations, the temperature (300 K) and pressure (1 atm) was controlled by Langevin thermostat and Berendsen barostat, respectively. Long-range electrostatic interactions were treated by Particle mesh Ewald algorithm with a grid size of 1 Å, and a cutoff of 10 Å was employed for short-range electrostatic and van der Waals interactions.
The SHAKE algorithm was applied to restrain the bond with hydrogens. MD simulations were performed on Amber20, pmemd.cuda program.

Markov State Model (MSM)
Starting with the code base of the current stable version 3.8.0 of MSMBuilder [7] , we developed a more robust algorithm to describe the transition process in Markov state model. Our algorithm modifies the fixed lag time into a random one by a kernel function, which is further used to count transition matrix and build MSM model. As a consequence, the MSM model based on our algorithm exhibits a more robust and powerful representation ability for describing the protein conformational space. We will discuss this method in depth in our future publications.
From the trajectories of WT and Omicron system, the time-lagged independent component (tIC) analysis was firstly applied to decrease the dimension of the conformational space [8] . We selected the residue pairs, in which one residue was from RBD and the other was from ACE2. For each residue pair, we measured any pair of distances between the heavy atoms on RBD and those on ACE2 and kept the distances less than 5 Å as inputs for ContactFeaturizer. Then, the lag time of tIC analysis was set to 50 ns and 200 microstates were clustered by K-Centers algorithm. Multiple transition probability matrixes (TPMs) were further calculated according to the transitions among microstates. According to Eq. (1), the implied timescale test was performed to confirm the Markovian of microstates.
where τ represents the lag time for the TPMs, λi is the i-th eigenvalue of the TPM and τi is the implied timescale for the i-th relaxation of the MSM. As a function of the lag time τ, τi (especially τ1 for the slowest transition) is a constant when the transition among microstates are Markovian [9,10] . From the Markovian microstates, the macrostates were then clustered via the PCCA+ algorithm. Using transition path theory (TPT), the properties for transition, such as transition time and direction, were calculated [11] .

MM/GBSA calculation
MMPBSA.py plugin in AmberTools20 was applied to exploit Molecular Mechanics/Generalized Born Surface Area (MM/GBSA) in binding free energy calculations in balanced trajectories for WT, and Omicron simulation systems [12] . In total, the binding free energy (ΔGbind) of RBD towards ACE2 is expressed as equation (2).
Meanwhile, the second law of thermodynamics reflects that ΔGbind equals to enthalpy changes (ΔH) minus the product of entropy changes and temperature (TΔS), as equation (3) expresses.
The system conformation entropy (termed as "−T∆S") is usually estimated by normal mode analysis with a quasi-harmonic model, however, accurate estimation of the conformation entropy for the protein-protein interactions remains challenging. Notably, the item could be omitted here considering that the differences of enthalpy (termed as "ΔH") are large enough and the similarity of system conformation entropy among simulation systems [13] . Therefore, we omitted the calculation of the −T∆S term and only concentrated on the relative ordering of the free energy changes.
In the simulation process, ΔH is transformed into the sum of the molecular mechanical energies (ΔEMM) and the solvation free energy (ΔGsolv), according to equation (4).
ΔGnp was calculated based on the solvent-accessible surface area (SASA) in equation (7).
According to the GB model, SASA was calculated via LCPO and γ was 0.005.
The decomposition of the free energy into residues was subsequently carried out by the MMPBSA.py plugin [12] . During the decomposition process, 1-4 interactions were added to electric interactions and van der Walls interactions.